
    Novel Natural Language Processing Models for Medical Terms and Symptoms Detection in Twitter

    This dissertation develops novel NLP models to disambiguate language on Twitter about drug use, types of drug consumption, and drug legalization, using ontology-enhanced approaches and data-driven prediction analysis. Three technical aims comprise this work: (a) leveraging pattern recognition techniques to improve the quality and quantity of crawled Twitter posts related to drug abuse; (b) using an expert-curated, domain-specific DsOn ontology model that improves knowledge extraction in the form of drug-to-symptom and drug-to-side-effect relations; and (c) modeling the prediction of public perception of drug legalization and the sentiment analysis of drug consumption on Twitter. We collected 7.5 million tweets from August 2015 to March 2016. This work leveraged a longstanding, multidisciplinary collaboration between researchers at the Population & Center for Interventions, Treatment, and Addictions Research (CITAR) in the Boonshoft School of Medicine and the Department of Computer Science and Engineering. In addition, we aimed to develop and deploy an innovative prediction analysis algorithm for eDrugTrends, capable of semi-automated processing of Twitter data to identify emerging trends in cannabis and synthetic cannabinoid use in the U.S. The study also included a fourth aim, a use-case study that analyzes tweet content about people living with HIV (PLWH) and medication patterns and identifies keyword trends in Twitter-based, user-generated content. This case study leveraged a multidisciplinary collaboration between researchers at the Departments of Family Medicine and Population and Public Health Sciences at Wright State University's Boonshoft School of Medicine and the Department of Computer Science and Engineering. We collected 65K tweets in the U.S.-based HIV knowledge domain from February 2022 to July 2022 via the Twitter streaming API.
For knowledge discovery, domain knowledge plays a significant role in powering many intelligent frameworks, such as data analysis, information retrieval, and pattern recognition. Recent NLP and semantic web advances have contributed to extending the domain knowledge of medical terms. These techniques require a bag of seed terms for medical knowledge discovery. Some initial seeds introduce irrelevant, noisy data and degrade the performance of the prediction analysis. The methodology of aim one, the PatRDis classifier, addresses noise and ambiguity, and the methodology of aim two, the DsOn ontology model, performs semantic parsing and enriches online medical content to classify the data for HIV care medication engagement and symptom detection from Twitter. By applying the methodology of aims two and three, we resolved the challenges of ambiguity and explored more than 1500 cannabis and cannabinoid slang terms. We measured sentiment preceding the election and found, for example, that states with high levels of positive sentiment before the election were engaged in enhancing their legalization status. We also used the same dataset for prediction analysis of marijuana legalization and for consumption trend analysis (Ohio public polling data). In aim four, we ran three experiments (ensemble learning, the RNN-LSTM model, and the NNBERT-CNN model) and applied five techniques to identify the tweets associated with medication adherence and HIV symptoms. The long short-term memory (LSTM) model and the CNN for sentence classification produce accurate results and have recently been used in NLP tasks. CNN models use convolutional layers and max pooling or max-over-time pooling layers to extract higher-level features, while LSTM models can capture long-term dependencies between word sequences and are therefore well suited to text classification. We propose attention-based RNN, MLP, and CNN deep learning models that capitalize on the advantages of LSTM and BERT techniques with an additional attention mechanism.
We trained the model using NNBERT to evaluate the proposed model's performance. The test results showed that the proposed models produce more accurate classification results, and that BERT obtained higher recall and F1 scores than the MLP or LSTM models. In addition, we developed an intelligent tool capable of automated processing of Twitter data to identify emerging trends in HIV disease, HIV symptoms, and medication adherence.
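
The max-over-time pooling step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It slides a single convolutional filter over a sequence of word vectors and keeps only the strongest response. The function name, the toy vectors, and the filter weights are hypothetical and chosen for illustration; they are not taken from the dissertation's models.

```python
def conv_max_over_time(embeddings, filt):
    """Apply one convolutional filter to a sequence of word vectors and
    return the maximum response over all positions (max-over-time pooling).

    embeddings: list of word vectors, one per token (hypothetical toy data)
    filt: list of weight vectors, one per position in the filter window
    """
    width = len(filt)
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the filter width")

    responses = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Sum of element-wise products between the filter and the window
        score = sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        responses.append(score)

    # Max-over-time pooling: one feature per filter, independent of length
    return max(responses)


# Three tokens with 2-dimensional embeddings, one filter of width 2
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]
feature = conv_max_over_time(tokens, bigram_filter)
```

Because the pooled value does not depend on where in the tweet the strongest match occurs, the same filter yields a single fixed-size feature for tweets of any length, which is what makes the layer convenient for sentence classification.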

    Predicting Public Opinion on Drug Legalization: Social Media Analysis and Consumption Trends

    In this paper, we focus on the collection and analysis of relevant Twitter data on a state-by-state basis for (i) measuring public opinion on marijuana legalization by mining sentiment in Twitter data and (ii) determining the usage trends for six distinct types of marijuana. We overcome the challenges posed by the informal and ungrammatical nature of tweets to analyze a corpus of 306,835 relevant tweets, covering all states in the US, collected over the four months preceding the November 2015 Ohio marijuana legalization ballot and the four months after the election. Our analysis revealed two key insights: (i) people in states that have legalized recreational marijuana express more positive sentiment about marijuana than people in states that have either legalized only medicinal marijuana or have not legalized marijuana at all; (ii) states with a high percentage of positive sentiment about marijuana are more inclined to authorize its legal use (e.g., by allowing medical marijuana) or to broaden it (e.g., by allowing recreational marijuana in addition to medical marijuana). Our analysis shows that social media can provide reliable information and can serve as an alternative to traditional polling of public opinion on drug use and to epidemiology research.
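
The state-by-state comparison described above rests on an aggregation that can be sketched in a few lines. The sketch assumes each tweet has already been assigned a state and a sentiment label by some upstream classifier; the function name, the label strings, and the sample records are hypothetical and are not data from the paper.

```python
from collections import defaultdict


def positive_share_by_state(labeled_tweets):
    """Return the fraction of positive tweets per state.

    labeled_tweets: iterable of (state, sentiment) pairs, where sentiment
    is a string label such as "positive" or "negative" (assumed format).
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [positive, total]
    for state, sentiment in labeled_tweets:
        counts[state][1] += 1
        if sentiment == "positive":
            counts[state][0] += 1
    return {state: pos / total for state, (pos, total) in counts.items()}


# Toy records for illustration only
sample = [
    ("OH", "positive"),
    ("OH", "negative"),
    ("CO", "positive"),
    ("CO", "positive"),
]
shares = positive_share_by_state(sample)
```

Ranking states by this share is one simple way to compare sentiment across legalization statuses; the paper's actual sentiment mining and normalization may differ.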
